Fatigue and drowsiness detection techniques based on external features are still under
development, and methods of facial feature extraction require further work. This paper
discusses the processes, methods, and recent advancements in the field of drowsiness and
fatigue detection. The proposed model is intended for wide application in artificial
intelligence, drawing on the fundamentals of human-computer interaction, facial expression
recognition, and driver fatigue-sleepiness determination. This research outlines a
three-phase strategy for detecting drowsiness: face detection with skin segmentation, eye
tracking, and yawning detection. The Viola-Jones detector is used to detect facial features
in these three phases. Once the face has been identified, skin segmentation makes the system
lighting invariant by focusing on the chromatic components of skin, and it rejects most of
the non-face image background. Eye tracking and yawning detection are carried out by
template matching with the correlation coefficient. The feature vectors from each of the
above phases are concatenated, and a binary result is obtained. Sound and successive frames
are analyzed and classified into fatigue and non-fatigue states. If the time in the fatigue
state exceeds the threshold, the system sounds an alarm.

1. INTRODUCTION 

Modern computing technologies have brought major advances in artificial intelligence.
Eye closure detection is a difficult process because brightness, blurring, individual variances in
skin tone, and environmental variables all affect it. It has various applications; recent examples
include driver assistance systems, smart car development and enhancement, and control and warning
systems. Driverless vehicles based on artificial intelligence are being built in step with the
information era. While research on driver assistance systems continues, solutions are generated
based on need. Google and manufacturers including Toyota, Nissan, BMW, and Tesla are continuing
system R&D [1].

When currently available advanced driver-assistance systems (ADAS) are examined, studies in
numerous areas can be seen. The continual movement of roads, cars, and people generates
life-threatening accidents [2]. The Turkey Statistics Institute (TSI) reported on driver errors in
road accidents between 2009 and 2016 [3]. According to the National Highway Traffic Safety
Administration (NHTSA), sleepiness is said to be the cause of 56,000 road fatalities and injuries
per year [4]. Various studies aim to detect driver fatigue-sleepiness. These studies use body
temperature, heart rate, and brain electrical signals to assess driver alertness [5]. However,
auxiliary equipment attached to the human body, such as an electroencephalography (EEG) headset,
affects the driver and is difficult to integrate in real life.

Another line of work [6] observes how the vehicle responds to the driver. Parameters describing
the driving style and driver status are obtained from the accelerator pedal and from sensors such
as those on the steering wheel, and these parameters are used to diagnose fatigue [6]. The results
of this approach vary widely with the driver's attributes, and the success rate is poor. Also,
putting the sensors in the right places on the vehicle is difficult and requires expertise, and
these systems require maintenance and repair. There are many studies in the literature on
determining whether the driver is asleep or inattentive. Another way devised to determine driver
attention is to instantaneously examine and evaluate the driver's condition [7].

Sleepiness can be defined as the "urge to nod off." It is the result of the natural human
sleep-wake cycle, which is governed by homeostatic and circadian factors. The longer a person
stays awake, the greater the pressure to sleep and the harder it is to resist [8]. The circadian
pacemaker is an intrinsic biological clock that cycles approximately every 24 hours. Homeostatic
components interact with circadian factors to determine sleepiness. Periods of sleepiness normally
occur 12 hours after the mid-sleep period (in the afternoon for the great majority of sleepers) and
before the consolidated sleep period (mostly in the evening, before sleep) [9]. These cycles must
be understood as normal and inevitable, not as something to be overcome or ignored.


2. ALGORITHM 

Deep learning builds on multilayer feed-forward neural networks. Since its inception, the term
has been used for models with many hidden layers. It is used in image classification, description,
segmentation, video analysis and interpretation, audio detection and processing, and natural
language processing. With deep learning, a multi-level neural network is constructed that extracts
major attributes from unlabeled training data.

Local connections, weight sharing, and pooling have made convolutional neural networks (CNNs)
a popular choice in image processing and speech analysis. In image processing, the original image
can be input directly into the network without complicated pre-processing. A CNN is a multi-layer
neural network that is not fully connected. In a fully connected network, too many parameters
cause overfitting and prevent useful learning. The convolution formula is:


$$a_{i,j} = f\Big( \sum_{m} \sum_{n} w_{m,n} \, x_{i+m,\,j+n} + w_b \Big) \qquad (1)$$
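
As a minimal sketch of how (1) can be evaluated, the following NumPy function slides a kernel over an image and applies the activation f to each weighted sum. The function name `conv2d_single`, the "valid" border handling, and the choice of ReLU for f are assumptions made for illustration; the source does not specify them here.

```python
import numpy as np

def conv2d_single(x, w, w_b=0.0, f=lambda z: np.maximum(z, 0.0)):
    """Evaluate Eq. (1) at every position where the kernel fits (valid mode).

    x   : 2-D input image
    w   : 2-D kernel of weights w[m, n]
    w_b : scalar bias
    f   : activation function (ReLU is an assumption for this sketch)
    """
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    kh, kw = w.shape
    out_h = x.shape[0] - kh + 1
    out_w = x.shape[1] - kw + 1
    a = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Weighted sum over the kernel window, plus the bias term.
            a[i, j] = np.sum(w * x[i:i + kh, j:j + kw]) + w_b
    return f(a)

# Toy usage: a 4x4 ramp image and a 3x3 kernel of ones.
x_demo = np.arange(16).reshape(4, 4)
a_demo = conv2d_single(x_demo, np.ones((3, 3)), w_b=1.0)
```

The same kernel weights are reused at every position (i, j), which is the weight sharing that the text credits with keeping the parameter count small.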


A convolutional neural network has many convolution layers, pooling layers, and fully connected
layers. The neurons in each layer of a fully connected neural network are arranged in one
dimension, whereas the neurons of a convolutional neural network are arranged in three dimensions:
width, height, and depth. The convolutional layer is the crucial component of the convolutional
neural network. Its weight sharing reduces the number of network parameters, and its local
connectivity reduces the complexity of network computation. For a 1000x1000 picture, the input
layer has 1000x1000 = 10^6 nodes. If the first hidden layer has only 100 nodes, a fully connected
network needs (1000x1000+1)x100 parameters for that layer alone.
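
The arithmetic behind the 1000x1000 example can be checked with a short sketch. The helper names and the 3x3 kernel with 100 feature maps used for the shared-weight comparison are hypothetical values chosen only to illustrate the order of magnitude; they are not taken from the source.

```python
def fully_connected_params(n_inputs, n_hidden):
    # Each hidden node has one weight per input plus one bias.
    return (n_inputs + 1) * n_hidden

def shared_kernel_params(kernel_h, kernel_w, n_feature_maps):
    # With weight sharing, each feature map needs only one kernel plus one bias,
    # independent of the image size. Single-channel input is assumed.
    return (kernel_h * kernel_w + 1) * n_feature_maps

n_inputs = 1000 * 1000                               # nodes in the input layer
fc_count = fully_connected_params(n_inputs, 100)     # first hidden layer, 100 nodes
conv_count = shared_kernel_params(3, 3, 100)         # hypothetical 3x3 kernels
print(fc_count, conv_count)
```

Under these assumptions the fully connected layer needs about 10^8 parameters, while the convolutional layer needs about 10^3, which is the reduction attributed to weight sharing.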


3. PROPOSED METHOD 

The face is detected with the Viola-Jones detector [10]. To segment the skin, the detected face
is converted to the YCbCr color space. In the YCbCr space, the effect of illumination can be "wiped
out by considering only the chromatic segments." In the red, green, blue (RGB) model, each
component (red, green, and blue) carries a different brightness. In a YCbCr picture, brightness is
carried only by the luma component Y, while the blue-difference (Cb) and red-difference (Cr)
components carry no luminance. The YCbCr picture is split into its Y, Cb, and Cr components. Skin
color is confined to the chrominance plane and, moreover, is distributed over only a small area of
that plane. As a result, a large percentage of the non-face image is immediately rejected.
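
A minimal sketch of chrominance-based skin segmentation follows. The conversion uses the full-range BT.601 (JPEG) coefficients, and the Cb and Cr ranges are commonly quoted illustrative values; both are assumptions, since the source does not state which conversion or thresholds were used.

```python
import numpy as np

def rgb_to_ycbcr(rgb):
    """Full-range BT.601 conversion (assumed); input shape (..., 3), values 0..255."""
    rgb = np.asarray(rgb, dtype=float)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return np.stack([y, cb, cr], axis=-1)

def skin_mask(rgb, cb_range=(77, 127), cr_range=(133, 173)):
    """True where the chroma falls inside the assumed skin region; Y is ignored."""
    ycbcr = rgb_to_ycbcr(rgb)
    cb, cr = ycbcr[..., 1], ycbcr[..., 2]
    return ((cb >= cb_range[0]) & (cb <= cb_range[1]) &
            (cr >= cr_range[0]) & (cr <= cr_range[1]))

# Toy 2x2 image: a skin-like tone, pure blue, pure green, mid gray.
demo = np.array([[[224, 172, 138], [0, 0, 255]],
                 [[0, 255, 0], [128, 128, 128]]], dtype=np.uint8)
mask = skin_mask(demo)
```

Because the mask never looks at Y, scaling the brightness of a pixel changes the result far less than it would in RGB, which is the lighting invariance the text describes.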

The state of the eyes is the most essential component in determining driver tiredness. When a
person is sleepy, the eyelids stay nearer to the closed position. We use the Viola-Jones detector
to locate the driver's eyes. Because the eyes lie on opposite sides of the face, they are
separated by symmetry. The center of each eye is determined from its location. Finally, the pupil
is detected. If the eyes are open, the state is regarded as normal. Table 1 shows that eyes in
different states have different features. The distinction between fully open and half-open eyes is
sometimes misjudged, resulting in false warnings, and the varying movement of the driver's head
might result in failure. Figure 1 shows the schematic diagram of the eye positioning and the steps
involved in the process.


Table 1. Feature matrix of eye
Variable     Area (No. of pixels)   Avg. height   Ratio
Full open    204                    7.62          2.87
Half open    155                    6.79          3.04
Closed       117                    6.02          3.17

[Figure 1 flowchart: input of positioned face image -> location of the eye -> grayscale image -> Sobel edge extraction -> Hough transform localization -> output of eye location]


Figure 1. Schematic of the eye positioning 


Another indicator of fatigue in driving is yawning, which occurs when a person is tired and
about to nod off. Once the mouth area has been found using Viola-Jones, it is segmented by K-means
clustering [11] and matched against templates using the correlation coefficient [12]. In K-means,
objects within a cluster are as close to each other as possible and as far as possible from objects
in other clusters. Each of the K clusters is identified by its centroid. The K-means function
performs the clustering so that the total of the distances from each object to its cluster
centroid, summed over all K clusters, is a minimum. The objective is to obtain the minimum distance
between classes or, more fundamentally, between pixels [13]. Figure 2 shows the Sobel edge process
for the detection of eyes, and Figure 3 shows the face detection framework for yawning
detection [14].


[Figure 2 panels: detected eyes (right, left) and templates]
Figure 2. Detection of eyes

[Figure 3 blocks: face detection & skin segmentation, eyes monitoring, yawning detection, continue capturing]
Figure 3. Face detection framework

$$c_j = \{\, x_i \mid \min \lVert x_i - x_j \rVert \,\} \qquad (2)$$
$$\arg\min \sum \lVert c_j - x_j \rVert \qquad (3)$$


In (2) and (3), x_i is the i-th pixel, x_j is the center of class j, and c_j is the set of pixels
in class j. Pixels are classified by their brightness intensity. Finally, the largest region of the
segmented image is taken as the mouth, and yawning is identified using K=2 templates. The open and
closed templates are both 38x62 [15].
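
The two steps above, clustering pixels by intensity and comparing the mouth region with open and closed templates, can be sketched as follows. This is an illustrative NumPy version under stated assumptions: the function names, the deterministic initialization, and the rule of picking the template with the higher correlation coefficient are hypothetical, and the templates in the usage note are synthetic rather than the 38x62 templates of the paper.

```python
import numpy as np

def kmeans_intensity(pixels, k=2, iters=20):
    """K-means on scalar intensities; returns (labels, centers)."""
    pixels = np.asarray(pixels, dtype=float).ravel()
    # Deterministic start (assumption): spread centers over the intensity range.
    centers = np.linspace(pixels.min(), pixels.max(), k)
    for _ in range(iters):
        # Assignment step, as in (2): each pixel joins its nearest center.
        labels = np.argmin(np.abs(pixels[:, None] - centers[None, :]), axis=1)
        # Standard K-means update: move each center to the mean of its class.
        new_centers = np.array([
            pixels[labels == j].mean() if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

def correlation_coefficient(a, b):
    """2-D correlation coefficient between two equally sized arrays."""
    a = np.asarray(a, dtype=float) - np.mean(a)
    b = np.asarray(b, dtype=float) - np.mean(b)
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return float((a * b).sum() / denom) if denom > 0 else 0.0

def classify_mouth(patch, open_template, closed_template):
    """Pick the template that correlates best with the patch (assumed rule)."""
    r_open = correlation_coefficient(patch, open_template)
    r_closed = correlation_coefficient(patch, closed_template)
    return "open" if r_open > r_closed else "closed"
```

For example, `kmeans_intensity([10, 12, 11, 200, 205, 198])` separates the three dark pixels from the three bright ones, and a patch that is a brightness-scaled copy of the open template is classified as "open" because the correlation coefficient is unaffected by such scaling.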

This deep learning model is trained on images from a video device. The recorded behaviors include
yawning, slow blinking, sleepy head gestures, and sleepy eyes. Infrared cameras were also used to
record night videos. The result is 9.5 hours of content at 640x480 resolution and 30 frames per
second. A convolutional neural network (CNN) is made up of layers that are structured to extract
features [16], [19]. The CNN is inspired by the arrangement of the visual cortex. Figure 4 depicts
a seven-layer neural network with one input layer, five hidden layers, and an output layer. It has
two convolutional layers borrowed from Inception and several pooling layers to reduce the
computational load of the layers [17]. The input is a 100x100 RGB image, so each of the thirty
thousand input neurons corresponds to one RGB value (100 x 100 x 3 = 30,000) [20]. The first
network layer is convolutional layer 1, with 64 channels and a 3x3-pixel kernel. The second
convolutional layer also has 64 channels and a kernel size of three pixels, and it uses
ReLU [18], [21].
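
The sizes quoted for the input and the two convolutional layers can be checked with a short shape and parameter walk-through. The stride of 1 and the absence of padding are assumptions for this sketch; the source gives only the input size, channel count, and kernel size.

```python
def conv_output_size(size, kernel, stride=1, padding=0):
    # Standard output-size formula for a convolution along one axis.
    return (size + 2 * padding - kernel) // stride + 1

def conv_params(in_channels, out_channels, kernel):
    # One kernel x kernel x in_channels filter plus one bias per output channel.
    return (kernel * kernel * in_channels + 1) * out_channels

height, width, channels = 100, 100, 3
input_values = height * width * channels              # input neurons

conv1_params = conv_params(channels, 64, 3)            # first 3x3 layer, 64 channels
conv2_params = conv_params(64, 64, 3)                  # second 3x3 layer, 64 channels

side_after_conv1 = conv_output_size(height, 3)         # assumes stride 1, no padding
side_after_conv2 = conv_output_size(side_after_conv1, 3)
print(input_values, conv1_params, conv2_params, side_after_conv1, side_after_conv2)
```

Under these assumptions the two convolutional layers together hold fewer than 40,000 parameters, far fewer than a single fully connected layer on the same 30,000 inputs would need.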


[Figure 4 blocks: input data -> convolutional neural network -> output data]


Figure 4. Basic block diagram of CNN 




4. RESULTS AND DISCUSSION 

The camera captures a video of the driver, and Figure 5 shows the four stages in which the system
processes it. The sections that follow describe how each frame is monitored. We go over the
features, advantages, and algorithms of a prototype system for detecting driver fatigue. It is
divided into four stages: initialization and preparation (detecting stage), eye tracking (tracking
stage), detection of early warning signs (warning stage), and the alert stage. To determine whether
or not a driver is fatigued, we look at non-intrusive external signals. For this project, we are
investigating the use of systems engineering to turn the current prototype into a system that can
enable further research in this field [22].


[Figure 5 state diagram: detecting stage -> tracking stage -> warning stage -> alert stage; transition labels: driver detected, eyes closed for too long, sleepiness detected, lost track, eyes opened]


Figure 5. The four stages of our drowsiness detection system 
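
The four stages of Figure 5 can be read as a small state machine. The sketch below is one hypothetical interpretation: the class name, the per-frame inputs, and the two frame-count thresholds are assumptions introduced for illustration, and the transitions are inferred from the labels in the figure rather than taken from a published specification.

```python
from enum import Enum

class Stage(Enum):
    DETECTING = "detecting"
    TRACKING = "tracking"
    WARNING = "warning"
    ALERT = "alert"

class DrowsinessMonitor:
    """Per-frame state machine; thresholds are hypothetical frame counts."""

    def __init__(self, warn_frames=15, alert_frames=45):
        self.warn_frames = warn_frames
        self.alert_frames = alert_frames
        self.stage = Stage.DETECTING
        self.closed_run = 0  # consecutive frames with closed eyes

    def update(self, face_found, eyes_closed):
        if not face_found:
            # Lost track: fall back to the detecting stage.
            self.stage = Stage.DETECTING
            self.closed_run = 0
            return self.stage
        if self.stage is Stage.DETECTING:
            self.stage = Stage.TRACKING
        if eyes_closed:
            self.closed_run += 1
        else:
            # Eyes opened: return to normal tracking.
            self.closed_run = 0
            self.stage = Stage.TRACKING
        if self.closed_run >= self.alert_frames:
            self.stage = Stage.ALERT      # time in fatigue state exceeds threshold
        elif self.closed_run >= self.warn_frames:
            self.stage = Stage.WARNING
        return self.stage
```

A caller would invoke `update` once per video frame and sound the alarm whenever the returned stage is `Stage.ALERT`.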


4.1. Image pre-processing 

Image pre-processing affects the system's accuracy. Light compensation is first performed using
histogram equalization [5], [23]. Here, we apply histogram equalization to each area of the color
image, as illustrated in Figure 5. This yields a compensated image. To improve the system's
efficiency, the resolution of the compensated image is then reduced.
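
A minimal sketch of histogram equalization on an 8-bit image is given here. The function names are hypothetical, and applying the equalization independently to each color channel is an assumption about how the light compensation is carried out.

```python
import numpy as np

def equalize_histogram(gray):
    """Histogram equalization of a 2-D uint8 image via its cumulative histogram."""
    gray = np.asarray(gray, dtype=np.uint8)
    hist = np.bincount(gray.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]          # cumulative count at the darkest used level
    n = gray.size
    if n == cdf_min:
        return gray.copy()             # constant image: nothing to stretch
    lut = (cdf - cdf_min) * 255.0 / (n - cdf_min)
    lut = np.clip(np.round(lut), 0, 255).astype(np.uint8)
    return lut[gray]                   # map every pixel through the lookup table

def equalize_color(rgb):
    """Equalize each channel separately (assumed reading of the text)."""
    rgb = np.asarray(rgb, dtype=np.uint8)
    return np.stack([equalize_histogram(rgb[..., c]) for c in range(rgb.shape[-1])],
                    axis=-1)

low_contrast = np.array([[100, 110], [120, 130]], dtype=np.uint8)
stretched = equalize_histogram(low_contrast)
```

In the toy example the four gray levels, originally packed between 100 and 130, are spread over the full 0 to 255 range.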

4.2. Face detection 

Face detection is used to reduce the number of false positives in the recognition of facial
appearance. The positioning of the eyes and lips is critical. The face must be located before the
image is converted to YCbCr [11], [24].
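
The Viola-Jones detector evaluates Haar-like rectangle features in constant time by means of an integral image. The sketch below shows only that building block, not a full cascade; the function names and the particular two-rectangle feature are chosen for illustration.

```python
import numpy as np

def integral_image(gray):
    """Summed-area table with a zero row and column prepended."""
    g = np.asarray(gray, dtype=float)
    ii = g.cumsum(axis=0).cumsum(axis=1)
    return np.pad(ii, ((1, 0), (1, 0)))

def rect_sum(ii, top, left, height, width):
    """Sum of the pixels in a rectangle using four table lookups."""
    bottom, right = top + height, left + width
    return ii[bottom, right] - ii[top, right] - ii[bottom, left] + ii[top, left]

def two_rect_feature(ii, top, left, height, width):
    """Lower half minus upper half of a window (one kind of Haar-like feature)."""
    half = height // 2
    upper = rect_sum(ii, top, left, half, width)
    lower = rect_sum(ii, top + half, left, half, width)
    return lower - upper

img = np.arange(16).reshape(4, 4)
ii = integral_image(img)
```

A feature of this kind responds strongly where a dark band sits above a brighter one, for example the eye region above the cheeks, which is why such features are useful for locating faces and eyes.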


4.3. Eye location and recognition 

To identify driver fatigue, the state of the eye (open or closed) must be determined. When a
person is drowsy, the eyelid muscles tend to close the eyes and hasten falling asleep. The driver's
eyes are located using Viola-Jones [25]. The two eyes are then separated by their symmetry [26],
and the center of each eye is determined [27]. The pupil is then identified. If the eyes are open,
the state is regarded as normal and no warning is issued. If the eyes are closed, the state is
regarded as fatigue and a warning is issued. Edge detection can be used to detect changes in pixel
intensity [28]. Some operators, such as Sobel, identify edges by detecting changes in image
intensity. In the proposed work, the Sobel edge detection strategy outperforms other
techniques [29].
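
As an illustration of Sobel edge detection, the following sketch computes the gradient magnitude of a grayscale image with the standard 3x3 Sobel kernels. It is a plain NumPy version written for clarity, with border pixels dropped; it is not the implementation used in the paper.

```python
import numpy as np

def sobel_magnitude(gray):
    """Gradient magnitude from the 3x3 Sobel kernels (output loses a 1-pixel border)."""
    g = np.asarray(gray, dtype=float)
    kx = np.array([[-1, 0, 1],
                   [-2, 0, 2],
                   [-1, 0, 1]], dtype=float)   # responds to horizontal intensity change
    ky = kx.T                                   # responds to vertical intensity change
    h, w = g.shape
    gx = np.zeros((h - 2, w - 2))
    gy = np.zeros((h - 2, w - 2))
    for i in range(3):
        for j in range(3):
            window = g[i:i + h - 2, j:j + w - 2]
            gx += kx[i, j] * window
            gy += ky[i, j] * window
    return np.hypot(gx, gy)

# Toy image with a vertical step edge between columns 2 and 3.
step = np.zeros((5, 6))
step[:, 3:] = 10.0
edges = sobel_magnitude(step)
```

The response is zero in the flat regions and large along the step, so thresholding the magnitude yields the edge points used in the later eye-state analysis.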

The eye's features are extracted to determine its state. Normally, the left eye's state is the
same as the right eye's. We therefore consider the state of only one eye in each frame, which also
helps reduce computational complexity [30]. This step uses two methods: binarization and Canny edge
detection. Figure 6 shows binary patterns of an open eye, Figures 6(a) and 6(b), and of a closed
eye, Figures 6(c) and 6(d). Once the eye image has been converted, the height of the eyelids is
used to determine the eye's state [31].


r = uk (4) 
$$I_{bin}(X,Y) = \begin{cases} 1, & I_{gray}(X,Y) \ge T \\ 0, & I_{gray}(X,Y) < T \end{cases} \qquad (5)$$

[Figure 6 panels: (a) (b) (c) (d)]
Figure 6. Binary pattern of (a), (b) an open eye and (c), (d) closed eye 
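
The use of a binarized eye image and the eyelid height to decide the eye state can be sketched as follows. The threshold value, the minimum opening height, and the assumption that the visible eye opening is the dark region below the threshold T are all hypothetical choices for this illustration.

```python
import numpy as np

def binarize(gray, threshold):
    """1 where the pixel is at least the threshold T, 0 otherwise."""
    return (np.asarray(gray) >= threshold).astype(np.uint8)

def eyelid_opening_height(binary):
    """Vertical extent, in rows, of the dark (0) region; 0 if there is none."""
    rows = np.flatnonzero(np.any(binary == 0, axis=1))
    return 0 if rows.size == 0 else int(rows[-1] - rows[0] + 1)

def eye_state(gray, threshold=100, min_open_height=3):
    """Classify as open when the dark region is tall enough (assumed rule)."""
    height = eyelid_opening_height(binarize(gray, threshold))
    return "open" if height >= min_open_height else "closed"

# Synthetic eye patches: bright skin with a dark opening.
open_eye = np.full((7, 9), 200)
open_eye[2:5, 3:6] = 20          # tall dark region
closed_eye = np.full((7, 9), 200)
closed_eye[3, 2:7] = 20          # thin dark line of the closed lid
```

On these synthetic patches the open eye has an opening three rows high and the closed eye only one row, so the height alone separates the two states.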
$$G(X,Y) = I(X,Y) * G_\sigma(X,Y) \qquad (6)$$
$$G_\sigma(X,Y) = \frac{1}{2\pi\sigma^2} \, e^{-\frac{X^2+Y^2}{2\sigma^2}} \qquad (7)$$
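
Equations (6) and (7) describe Gaussian smoothing, which is the first step of Canny edge detection. A minimal NumPy sketch is given here; truncating the kernel at three standard deviations, renormalizing it, and replicating the border pixels are assumptions made for this illustration.

```python
import numpy as np

def gaussian_kernel(sigma, radius=None):
    """Sampled 2-D Gaussian of Eq. (7), truncated and renormalized to sum to 1."""
    if radius is None:
        radius = int(np.ceil(3 * sigma))       # assumed truncation at 3 sigma
    ax = np.arange(-radius, radius + 1)
    xx, yy = np.meshgrid(ax, ax)
    g = np.exp(-(xx ** 2 + yy ** 2) / (2 * sigma ** 2)) / (2 * np.pi * sigma ** 2)
    return g / g.sum()

def gaussian_smooth(image, sigma):
    """Eq. (6): convolve the image with the Gaussian kernel (edge-replicated border)."""
    img = np.asarray(image, dtype=float)
    k = gaussian_kernel(sigma)
    r = k.shape[0] // 2
    padded = np.pad(img, r, mode="edge")
    h, w = img.shape
    out = np.zeros((h, w))
    for dy in range(k.shape[0]):
        for dx in range(k.shape[1]):
            # The kernel is symmetric, so correlation and convolution coincide.
            out += k[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out

kernel = gaussian_kernel(1.0)
```

A larger sigma gives a wider kernel and stronger smoothing, which is how sigma adjusts the scale at which edges are detected.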


Canny's edge detection algorithm is well known for its ability to generate continuous edges.
First, the image is smoothed by Gaussian convolution, as in (6) and (7) [32], where σ can be used
to adjust the scale. The differential operator then determines the magnitude and orientation of the
edge. Edge data at various scales are combined to obtain the final edge image [33]. The edge points
are summed to determine the eye's state. Classification is done using a binary support vector
machine (SVM) classifier with a linear kernel [34]. The system was implemented in MATLAB 2017 and
processes video frames from a 5-MP camera at 15 fps. In the proposed approach, the driver's facial
fatigue indications are taken into account to determine whether they are correctly detected [35].

The approach was tested in both low and high light conditions to verify its performance. The first
analysis (scenario 1) was performed in broad daylight at a distance that was as close to ideal as
possible. Accuracy was found to be between 85% and 95% under normal daylight conditions [36]. As
can be seen in Figure 7, the percentage of yawns detected was higher than the percentage of eye
movements detected as signs of sleepiness.

The second analysis (scenario 2) was done in low light and close proximity [37]. The software ran
with an accuracy of 75 to 80 percent, which is 10 to 15 percent lower than in scenario 1
(daylight). The detection percentage for yawning was again higher than the detection percentage for
eye movement [38]. Figure 7 depicts a drowsy and a normal sample under the same conditions.

The third analysis (scenario 3) was done in artificial light at night with the best possible
proximity. The program was found to be between 90 and 93 percent accurate, which compares well with
scenarios 1 and 2. As before, the detection of yawning was more accurate than the detection of eye
movement as a sign of sleepiness. A drowsy sample and a normal one are depicted in Figure 8 for
comparison.

The final analysis (scenario 4) was conducted under low-light conditions and in close proximity to
the samples. Here the program had its lowest accuracy of all the scenarios, ranging from 65 to 68
percent. The detection percentage for yawning was again better than that for eye movement. As
depicted in Figure 8, one sample was drowsy and the other was awake.

Image capture and analysis rely heavily on proximity, and in certain settings the closest possible
proximity is required to improve performance and detect eye and lip gestures [39]. We found that
the camera and the feature should be as close as possible to each other to avoid any interference.
As the first step in the detection procedure, a support vector machine classifier is used to
identify the eye and mouth movements [40]. Table 2 shows the percentage accuracy for trials 1 to 4
in each of scenarios 1 to 4. Table 3 shows the average accuracy over all trials in each scenario.
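
The binary SVM with a linear kernel mentioned above can be illustrated with a small NumPy stand-in trained by subgradient descent on the regularized hinge loss. This is a sketch and not the MATLAB classifier used in the paper; the function names, hyperparameters, and the two-dimensional toy features (which could stand for standardized edge-point count and eyelid height) are hypothetical.

```python
import numpy as np

def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=100):
    """Batch subgradient descent on the L2-regularized hinge loss; y in {-1, +1}."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        margins = y * (X @ w + b)
        viol = margins < 1.0                      # samples inside the margin
        grad_w = lam * w - (y[viol, None] * X[viol]).sum(axis=0) / len(y)
        grad_b = -y[viol].sum() / len(y)
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

def predict(X, w, b):
    """Sign of the linear decision function, as -1 or +1."""
    return np.where(np.asarray(X, dtype=float) @ w + b >= 0, 1, -1)

# Toy, linearly separable features: -1 = closed eye, +1 = open eye (hypothetical).
X_train = np.array([[-2.0, -1.0], [-1.0, -2.0], [-1.5, -1.5],
                    [2.0, 1.0], [1.0, 2.0], [1.5, 1.5]])
y_train = np.array([-1, -1, -1, 1, 1, 1])
w_fit, b_fit = train_linear_svm(X_train, y_train)
```

With a linear kernel the decision reduces to the sign of a weighted sum of the features, which keeps per-frame classification cheap enough for real-time use.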




Figure 7. Face detection for alert and drowsy state 


Figure 8. Face segmentation of eyes and lips 


Table 2. Accuracy analysis of scenario vs trial
Scenario      Percentage accuracy
              Trial-1   Trial-2   Trial-3   Trial-4
Scenario-1    91%       90%       91.5%     92.10%
Scenario-2    77%       80%       83%       82%
Scenario-3    91%       92.5%     93%       94%
Scenario-4    65%       68.50%    66%       72%


Table 3. Average accuracy per scenario
Scenario      Percentage accuracy
Scenario-1    91%
Scenario-2    81%
Scenario-3    93%
Scenario-4    68%
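
The averages in Table 3 can be reproduced from the trial values in Table 2 with a few lines of Python. The variable names are chosen for illustration.

```python
# Trial accuracies (percent) copied from Table 2.
accuracy = {
    "Scenario-1": [91.0, 90.0, 91.5, 92.10],
    "Scenario-2": [77.0, 80.0, 83.0, 82.0],
    "Scenario-3": [91.0, 92.5, 93.0, 94.0],
    "Scenario-4": [65.0, 68.5, 66.0, 72.0],
}

def mean(values):
    return sum(values) / len(values)

averages = {name: mean(trials) for name, trials in accuracy.items()}
for name, value in averages.items():
    print(f"{name}: {value:.3f}%")
```

The unrounded means are 91.15, 80.5, 92.625, and 67.875 percent, which agree with the whole-number values reported in Table 3 when halves are rounded up.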

5. CONCLUSION 

Owing to its high efficiency and good performance under different circumstances, the proposed
drowsiness detection can be implemented in real time; it is invariant to illumination and performs
well under various lighting conditions. Tracking the eyes and mouth is made simple using template
matching. The proposed framework achieves a high degree of accuracy in the four test cases,
surpassing the accuracy of the approaches used in the recent past. By identifying driver drowsiness
with substantial accuracy, the system could also help reduce the number of casualties per year.
With its current model, the system could not tell whether the person was nodding off with the head
falling to the side or whether the body was slumping. The head-lowering prediction might also need
to be included, subject to some form of threshold. The accuracy also decreases when the driver
wears glasses. Future work will attempt to make it so that "swing" remains the same.